A Monte Carlo Approach to Sequence Assembly

Author

  • Erik Sandelin
Abstract

Motivation: Assembling shotgun sequencing data from repetitive DNA sequences is a non-trivial task. Existing sequence assembly methods resolve repeats either by using statistical analyses to identify and separate the fragments that correspond to repeats, or by using extra information not contained in the fragments. In this paper we take a different approach: using the simulated-tempering Monte Carlo method, we resolve repeats by performing an extensive search of the solution space.

Results: The method is tested on two highly repetitive sequences, one with a two-copy repeat and one with a three-copy repeat. We find that the method correctly assembles both sequences, except for a twofold degeneracy for the three-copy repeat sequence. The alternative solution obtained in this case is related to the correct one by a simple symmetry. The performance of the method is compared with that of simulated annealing. We find that simulated tempering is a competitive alternative to simulated ...
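As a rough illustration of the simulated-tempering idea named in the abstract, the sketch below treats the temperature index as a dynamic variable that is updated by Metropolis steps alongside the configuration. It is not the paper's implementation. The energy function `toy_energy`, the swap move, the temperature ladder and the all-zero `weights` are assumptions chosen only to keep the example small; a real assembly energy would score fragment overlaps instead.

```python
import math
import random


def toy_energy(order):
    """Hypothetical cost: count adjacent pairs that are not consecutive integers."""
    return sum(1 for a, b in zip(order, order[1:]) if b != a + 1)


def simulated_tempering(start, temps, weights, n_steps, rng):
    """Sample (state, temperature index) jointly; return the best state seen."""
    state = list(start)
    e = toy_energy(state)
    k = len(temps) - 1  # begin at the highest temperature
    best, best_e = list(state), e
    for _ in range(n_steps):
        # Configuration update at fixed temperature: swap two positions.
        i, j = rng.sample(range(len(state)), 2)
        state[i], state[j] = state[j], state[i]
        new_e = toy_energy(state)
        if new_e <= e or rng.random() < math.exp(-(new_e - e) / temps[k]):
            e = new_e
            if e < best_e:
                best, best_e = list(state), e
        else:
            state[i], state[j] = state[j], state[i]  # reject: undo the swap

        # Temperature update at fixed configuration: propose a neighbouring index.
        k_new = k + rng.choice((-1, 1))
        if 0 <= k_new < len(temps):
            log_ratio = (-(1.0 / temps[k_new] - 1.0 / temps[k]) * e
                         + weights[k_new] - weights[k])
            if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
                k = k_new
    return best, best_e, k


rng = random.Random(0)
start = [3, 0, 5, 1, 4, 2]
temps = [0.3, 0.6, 1.2, 2.4]      # assumed temperature ladder
weights = [0.0] * len(temps)      # assumed; normally tuned per temperature
best, best_e, k = simulated_tempering(start, temps, weights, 5000, rng)
print(best, best_e)
```

With all weights set to zero the chain tends to spend more time at the higher temperatures. In practice the weights are usually tuned so that every temperature is visited with roughly equal frequency. Unlike simulated annealing, no cooling schedule is imposed: the system can heat up again to escape a local minimum.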


Similar resources

Bayesian Genome Assembly and Assessment by Markov Chain Monte Carlo Sampling

Most genome assemblers construct point estimates, choosing only a single genome sequence from among many alternative hypotheses that are supported by the data. We present a Markov chain Monte Carlo approach to sequence assembly that instead generates distributions of assembly hypotheses with posterior probabilities, providing an explicit statistical framework for evaluating alternative hypothes...

The study of neutron interactions with soft tissue using Monte Carlo simulation using the source PF

The most important part of neutron therapy treatment (NCT1) is to achieve a beam of neutrons with suitable energy and intensity, as well as the least pollution and damage. In this study, in order to correct the neutron spectrum from D-D fusion and its use in neutron therapy, a set of different materials which are called the Beam Shaping Assembly (BSA) was placed in the direction of energy 2.45 ...

Optimal Scheduling of Battery Energy Storage System in Distribution Network Considering Uncertainties using hybrid Monte Carlo-Genetic Approach

This paper proposes a novel hybrid Monte Carlo simulation-genetic approach (MCS-GA) for optimal operation of a distribution network considering renewable energy generation systems (REGSs) and battery energy storage systems (BESSs). The aim of this paper is to design an optimal charging/discharging schedule for BESSs so that the total daily profit of the distribution company (Disco) can be maximiz...

A New Approach for Monte Carlo Simulation of RAFT Polymerization

In this work, based on experimental observations and exact theoretical predictions, the kinetic scheme of RAFT polymerization is extended to a wider range of reactions such as irreversible intermediate radical terminations and reversible transfer reactions. The reactions included in the kinetic scheme are the more probable reactions from a theoretical point of view. The ...

A sequential Monte Carlo EM approach to the transcription factor binding site identification problem

Motivation: A significant and stubbornly intractable problem in genome sequence analysis has been the de novo identification of transcription factor binding sites in promoter regions. Although theoretically pleasing, probabilistic methods have faced difficulties due to model mismatch and the nature of the biological sequence. These problems result in inference in a high dimensional, highly multi...


Publication date: 2000